Goto

Collaborating Authors

 comprehensive review


Signal-to-Noise Ratio in Scanning Electron Microscopy: A Comprehensive Review

Sim, K. S., Bukhori, I., Ong, D. C. Y., Gan, K. B.

arXiv.org Artificial Intelligence

Scanning Electron Microscopy (SEM) is critical in nanotechnology, materials science, and biological imaging due to its high spatial resolution and depth of focus. Signal-to-noise ratio (SNR) is an essential parameter in SEM because it directly impacts the quality and interpretability of the images. SEM is widely used in various scientific disciplines, but its utility can be compromised by noise, which degrades image clarity. This review explores multiple aspects of the SEM imaging process, from the principal operation of SEM, sources of noise in SEM, methods for SNR measurement and estimations, to various aspects that affect the SNR measurement and approaches to enhance SNR, both from a hardware and software standpoint. We review traditional and emerging techniques, focusing on their applications, advantages, and limitations. The paper aims to provide a comprehensive understanding of SNR optimization in SEM for researchers and practitioners and to encourage further research in the field.


A Comprehensive Review on Harnessing Large Language Models to Overcome Recommender System Challenges

Raja, Rahul, Vats, Anshaj, Vats, Arpita, Majumder, Anirban

arXiv.org Artificial Intelligence

Recommender systems have traditionally followed modular architectures comprising candidate generation, multi-stage ranking, and re-ranking, each trained separately with supervised objectives and hand-engineered features. While effective in many domains, such systems face persistent challenges including sparse and noisy interaction data, cold-start problems, limited personalization depth, and inadequate semantic understanding of user and item content. The recent emergence of Large Language Models (LLMs) offers a new paradigm for addressing these limitations through unified, language-native mechanisms that can generalize across tasks, domains, and modalities. In this paper, we present a comprehensive technical survey of how LLMs can be leveraged to tackle key challenges in modern recommender systems. We examine the use of LLMs for prompt-driven candidate retrieval, language-native ranking, retrieval-augmented generation (RAG), and conversational recommendation, illustrating how these approaches enhance personalization, semantic alignment, and interpretability without requiring extensive task-specific supervision. LLMs further enable zero- and few-shot reasoning, allowing systems to operate effectively in cold-start and long-tail scenarios by leveraging external knowledge and contextual cues. We categorize these emerging LLM-driven architectures and analyze their effectiveness in mitigating core bottlenecks of conventional pipelines. In doing so, we provide a structured framework for understanding the design space of LLM-enhanced recommenders, and outline the trade-offs between accuracy, scalability, and real-time performance. Our goal is to demonstrate that LLMs are not merely auxiliary components but foundational enablers for building more adaptive, semantically rich, and user-centric recommender systems


Multimodal Large Language Models for Enhanced Traffic Safety: A Comprehensive Review and Future Trends

Tami, Mohammad Abu, Elhenawy, Mohammed, Ashqar, Huthaifa I.

arXiv.org Artificial Intelligence

Traffic safety remains a critical global challenge, with traditional Advanced Driver-Assistance Systems (ADAS) often struggling in dynamic real-world scenarios due to fragmented sensor processing and susceptibility to adversarial conditions. This paper reviews the transformative potential of Multimodal Large Language Models (MLLMs) in addressing these limitations by integrating cross-modal data such as visual, spatial, and environmental inputs to enable holistic scene understanding. Through a comprehensive analysis of MLLM-based approaches, we highlight their capabilities in enhancing perception, decision-making, and adversarial robustness, while also examining the role of key datasets (e.g., KITTI, DRAMA, ML4RoadSafety) in advancing research. Furthermore, we outline future directions, including real-time edge deployment, causality-driven reasoning, and human-AI collaboration. By positioning MLLMs as a cornerstone for next-generation traffic safety systems, this review underscores their potential to revolutionize the field, offering scalable, context-aware solutions that proactively mitigate risks and improve overall road safety.


Comprehensive Review of Neural Differential Equations for Time Series Analysis

Oh, YongKyung, Kam, Seungsu, Lee, Jonghun, Lim, Dong-Young, Kim, Sungil, Bui, Alex

arXiv.org Artificial Intelligence

Time series modeling and analysis has become critical in various domains. Conventional methods such as RNNs and Transformers, while effective for discrete-time and regularly sampled data, face significant challenges in capturing the continuous dynamics and irregular sampling patterns inherent in real-world scenarios. Neural Differential Equations (NDEs) represent a paradigm shift by combining the flexibility of neural networks with the mathematical rigor of differential equations. This paper presents a comprehensive review of NDE-based methods for time series analysis, including neural ordinary differential equations, neural controlled differential equations, and neural stochastic differential equations. We provide a detailed discussion of their mathematical formulations, numerical methods, and applications, highlighting their ability to model continuous-time dynamics. Furthermore, we address key challenges and future research directions. This survey serves as a foundation for researchers and practitioners seeking to leverage NDEs for advanced time series analysis.


Emotion Recognition and Generation: A Comprehensive Review of Face, Speech, and Text Modalities

Mobbs, Rebecca, Makris, Dimitrios, Argyriou, Vasileios

arXiv.org Artificial Intelligence

Emotion recognition and generation have emerged as crucial topics in Artificial Intelligence research, playing a significant role in enhancing human-computer interaction within healthcare, customer service, and other fields. Although several reviews have been conducted on emotion recognition and generation as separate entities, many of these works are either fragmented or limited to specific methodologies, lacking a comprehensive overview of recent developments and trends across different modalities. In this survey, we provide a holistic review aimed at researchers beginning their exploration in emotion recognition and generation. We introduce the fundamental principles underlying emotion recognition and generation across facial, vocal, and textual modalities. This work categorises recent state-of-the-art research into distinct technical approaches and explains the theoretical foundations and motivations behind these methodologies, offering a clearer understanding of their application. Moreover, we discuss evaluation metrics, comparative analyses, and current limitations, shedding light on the challenges faced by researchers in the field. Finally, we propose future research directions to address these challenges and encourage further exploration into developing robust, effective, and ethically responsible emotion recognition and generation systems.


A Comprehensive Review: Applicability of Deep Neural Networks in Business Decision Making and Market Prediction Investment

Trinh, Viet

arXiv.org Artificial Intelligence

Big data, both in its structured and unstructured formats, have brought in unforeseen challenges in economics and business. How to organize, classify, and then analyze such data to obtain meaningful insights are the ever-going research topics for business leaders and academic researchers. This paper studies recent applications of deep neural networks in decision making in economical business and investment; especially in risk management, portfolio optimization, and algorithmic trading. Set aside limitation in data privacy and cross-market analysis, the article establishes that deep neural networks have performed remarkably in financial classification and prediction. Moreover, the study suggests that by compositing multiple neural networks, spanning different data type modalities, a more robust, efficient, and scalable financial prediction framework can be constructed.


Multimodal Marvels of Deep Learning in Medical Diagnosis: A Comprehensive Review of COVID-19 Detection

Islama, Md Shofiqul, Hasanc, Khondokar Fida, Shajeebd, Hasibul Hossain, Ranae, Humayan Kabir, Rahmand, Md Saifur, Hasanb, Md Munirul, Azadf, AKM, Abdullahg, Ibrahim, Moni, Mohammad Ali

arXiv.org Artificial Intelligence

This study presents a comprehensive review of the potential of multimodal deep learning (DL) in medical diagnosis, using COVID-19 as a case example. Motivated by the success of artificial intelligence applications during the COVID-19 pandemic, this research aims to uncover the capabilities of DL in disease screening, prediction, and classification, and to derive insights that enhance the resilience, sustainability, and inclusiveness of science, technology, and innovation systems. Adopting a systematic approach, we investigate the fundamental methodologies, data sources, preprocessing steps, and challenges encountered in various studies and implementations. We explore the architecture of deep learning models, emphasising their data-specific structures and underlying algorithms. Subsequently, we compare different deep learning strategies utilised in COVID-19 analysis, evaluating them based on methodology, data, performance, and prerequisites for future research. By examining diverse data types and diagnostic modalities, this research contributes to scientific understanding and knowledge of the multimodal application of DL and its effectiveness in diagnosis. We have implemented and analysed 11 deep learning models using COVID-19 image, text, and speech (ie, cough) data. Our analysis revealed that the MobileNet model achieved the highest accuracy of 99.97% for COVID-19 image data and 93.73% for speech data (i.e., cough). However, the BiGRU model demonstrated superior performance in COVID-19 text classification with an accuracy of 99.89%. The broader implications of this research suggest potential benefits for other domains and disciplines that could leverage deep learning techniques for image, text, and speech analysis.


Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges

Niu, Qian, Liu, Junyu, Bi, Ziqian, Feng, Pohsun, Peng, Benji, Chen, Keyu, Li, Ming

arXiv.org Artificial Intelligence

This comprehensive review explores the intersection of Large Language Models (LLMs) and cognitive science, examining similarities and differences between LLMs and human cognitive processes. We analyze methods for evaluating LLMs cognitive abilities and discuss their potential as cognitive models. The review covers applications of LLMs in various cognitive fields, highlighting insights gained for cognitive science research. We assess cognitive biases and limitations of LLMs, along with proposed methods for improving their performance. The integration of LLMs with cognitive architectures is examined, revealing promising avenues for enhancing artificial intelligence (AI) capabilities. Key challenges and future research directions are identified, emphasizing the need for continued refinement of LLMs to better align with human cognition. This review provides a balanced perspective on the current state and future potential of LLMs in advancing our understanding of both artificial and human intelligence.


Integrative Approaches in Cybersecurity and AI

Omar, Marwan

arXiv.org Artificial Intelligence

In recent years, the convergence of cybersecurity, artificial intelligence (AI), and data management has emerged as a critical area of research, driven by the increasing complexity and interdependence of modern technological ecosystems. This paper provides a comprehensive review and analysis of integrative approaches that harness AI techniques to enhance cybersecurity frameworks and optimize data management practices. By exploring the synergies between these domains, we identify key trends, challenges, and future directions that hold the potential to revolutionize the way organizations protect, analyze, and leverage their data. Our findings highlight the necessity of cross-disciplinary strategies that incorporate AI-driven automation, real-time threat detection, and advanced data analytics to build more resilient and adaptive security architectures.


Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning

Zhang, Xiaodan, Vemulapalli, Sandeep, Talukdar, Nabasmita, Ahn, Sumyeong, Wang, Jiankun, Meng, Han, Murtaza, Sardar Mehtab Bin, Dave, Aakash Ajay, Leshchiner, Dmitry, Joseph, Dimitri F., Witteveen-Lane, Martin, Chesla, Dave, Zhou, Jiayu, Chen, Bin

arXiv.org Artificial Intelligence

This study assesses the ability of state-of-the-art large language models (LLMs) including GPT-3.5, GPT-4, Falcon, and LLaMA 2 to identify patients with mild cognitive impairment (MCI) from discharge summaries and examines instances where the models' responses were misaligned with their reasoning. Utilizing the MIMIC-IV v2.2 database, we focused on a cohort aged 65 and older, verifying MCI diagnoses against ICD codes and expert evaluations. The data was partitioned into training, validation, and testing sets in a 7:2:1 ratio for model fine-tuning and evaluation, with an additional metastatic cancer dataset from MIMIC III used to further assess reasoning consistency. GPT-4 demonstrated superior interpretative capabilities, particularly in response to complex prompts, yet displayed notable response-reasoning inconsistencies. In contrast, open-source models like Falcon and LLaMA 2 achieved high accuracy but lacked explanatory reasoning, underscoring the necessity for further research to optimize both performance and interpretability. The study emphasizes the significance of prompt engineering and the need for further exploration into the unexpected reasoning-response misalignment observed in GPT-4. The results underscore the promise of incorporating LLMs into healthcare diagnostics, contingent upon methodological advancements to ensure accuracy and clinical coherence of AI-generated outputs, thereby improving the trustworthiness of LLMs for medical decision-making.